Algorithmic patterns for H-matrices on many-core processors

Author

  • Peter Zaspel
Abstract

In this work, we consider the reformulation of hierarchical (H) matrix algorithms for many-core processors, with a model implementation on graphics processing units (GPUs). H matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of H matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing H matrix CPU implementations with many-core processors, we here aim to rely entirely on that processor type. As our main contribution, we introduce the parallel algorithmic patterns needed to map the full H matrix construction and the fast matrix-vector product to many-core hardware. Here, the crucial ingredients are space-filling curves, parallel tree traversal, and batching of linear algebra operations. The resulting model GPU implementation, hmglib, is, to the best of the author's knowledge, the first entirely GPU-based open-source H matrix library of this kind. We conclude this work with an in-depth performance analysis and a comparative performance study against a standard H matrix library, highlighting profound speedups of our many-core parallel approach.
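To illustrate the space-filling-curve ingredient named in the abstract, the following is a minimal sketch of Z-order (Morton) sorting of 2D points. The abstract does not say which curve is used, so the choice of the Morton curve, the function names `part1by1`, `morton2d`, and `morton_order`, and the 16-bit quantization are all assumptions made for illustration. The point of such an ordering is that spatially close points end up close together in a flat array, so a cluster tree can be described by contiguous index ranges, which suits data-parallel hardware.

```python
def part1by1(x: int) -> int:
    """Spread the lower 16 bits of x so a zero bit sits between each pair of bits."""
    x &= 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x


def morton2d(ix: int, iy: int) -> int:
    """Interleave the bits of two 16-bit integers (x in even bits, y in odd bits)."""
    return part1by1(ix) | (part1by1(iy) << 1)


def morton_order(points, bits: int = 16):
    """Return the indices that sort points in [0, 1)^2 along the Z-order curve.

    Hypothetical helper: coordinates are quantized to `bits` bits per axis
    (at most 16 here) before their bits are interleaved.
    """
    scale = (1 << bits) - 1
    codes = [morton2d(int(px * scale), int(py * scale)) for px, py in points]
    return sorted(range(len(points)), key=codes.__getitem__)


if __name__ == "__main__":
    pts = [(0.9, 0.9), (0.1, 0.1), (0.9, 0.1), (0.1, 0.9)]
    print(morton_order(pts))
```

On a GPU, the per-point code computation is independent across points and the final step is a key-value sort, so both stages map naturally to data-parallel primitives; this sketch only shows the sequential logic.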


Similar resources


Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications

Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...

Full text

Structured Parallel Programming with Deterministic Patterns

Many-core processors target improved computational performance by making available various forms of architectural parallelism, including but not limited to multiple cores and vector instructions. However, approaches to parallel programming that target these low-level parallel mechanisms directly lead to overly complex, non-portable, and often unscalable and unreliable code. A more struc...

Full text

Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems

Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge because hardware resources are difficult to schedule under thread concurrency. In this paper, to resolve this problem, a novel method is proposed that parallelizes the GA by designing three concurrent kernels, each of which runs some depe...

Full text

Some rank equalities for finitely many tripotent matrices

A rank equality is established for the sum of finitely many tripotent matrices via elementary block matrix operations. Moreover, by using this equality and Theorems 8 and 10 in [Chen M. et al., On the open problem related to rank equalities for the sum of finitely many idempotent matrices and its applications, The Scientific World Journal 2014 (2014), Article ID 702413, 7 page...

Full text


Journal:
  • CoRR

Volume abs/1708.09707

Publication date: 2017